Extraction of Complex Relations in Humanistic : Statistics, Itemsets and Association Rules
Internal identifier: 004F12 (Main/Exploration); previous: 004F11; next: 004F13
Authors: Martine Cadot [France]
French descriptors
- mix :
- apprentissage artificiel, codage et recodage des données, extraction de connaissances, fouille de données, fouille de textes, interaction statistique, motifs, motifs flous, nettoyage et prétraitement des données, règles d'association, règles floues, significativité statistique, test de randomisation.
English descriptors
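- mix :
- Association Rules, Data Cleaning and Preprocessing, Data Mining, Fuzzy Itemsets, Fuzzy rules, Itemsets, Knowledge Discovery, Machine Learning, Randomisation Test, Statistical Interaction, Statistical Significance, Text Mining.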
Abstract
This thesis is about data mining in the human sciences. This branch of artificial intelligence is a set of methods for extracting knowledge from electronic data. Among them, the extraction of itemsets and association rules builds a symbolic representation of the structure of the data, as classical statistical methods do, but, unlike them, it can handle complex and very large data sets. However, this computer science model, obtained by counting co-occurrences, is not easy for scientists to use: it works with dichotomous (True/False) data, its direct results are difficult to interpret, and its validity can seem doubtful to researchers who work with statistics. We propose four techniques, which we built and tested on real data, to make the extraction of itemsets and association rules easier for scientists to use: 1) our randomisation test, based on "exchanges in cascade" in the subjects x properties matrix, yields the statistically significant links between properties; 2) our fuzzification of the extraction of itemsets and association rules produces fuzzy association rules close to the fuzzy rules defined by researchers of the fuzzy community around Zadeh; 3) our Midova algorithm extracts only the interactions; and 4) our meta-rules clean the set of association rules of its main contradictions and redundancies.
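As a rough illustration of what "counting co-occurrences" on dichotomous data means, the sketch below computes the textbook support and confidence measures on a small subjects x properties matrix. It is not taken from the thesis: the toy matrix, the property names and the helper functions `support`, `confidence` and `frequent_itemsets` are all assumptions made for this example, and it does not implement the randomisation test, the fuzzification, Midova or the meta-rules.

```python
from itertools import combinations

# Hypothetical toy data: each row is a subject, each column a True/False property.
PROPERTIES = ["a", "b", "c"]
MATRIX = [
    [True, True, False],
    [True, True, True],
    [True, False, False],
    [False, True, True],
]


def support(itemset, matrix=MATRIX, properties=PROPERTIES):
    """Fraction of subjects for which every property of the itemset is True."""
    cols = [properties.index(p) for p in itemset]
    hits = sum(all(row[c] for c in cols) for row in matrix)
    return hits / len(matrix)


def confidence(antecedent, consequent):
    """Confidence of the rule antecedent -> consequent (0.0 if the antecedent never occurs)."""
    base = support(antecedent)
    if base == 0:
        return 0.0
    return support(tuple(antecedent) + tuple(consequent)) / base


def frequent_itemsets(min_support):
    """Brute-force enumeration of all itemsets; only sensible for a toy matrix."""
    found = {}
    for size in range(1, len(PROPERTIES) + 1):
        for itemset in combinations(PROPERTIES, size):
            s = support(itemset)
            if s >= min_support:
                found[itemset] = s
    return found


if __name__ == "__main__":
    for itemset, s in frequent_itemsets(0.5).items():
        print(itemset, s)
    print("confidence(c -> b) =", confidence(("c",), ("b",)))
```

On this toy matrix the rule c -> b holds for every subject that has c, which is the kind of raw result whose statistical significance the abstract says still has to be checked.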
Url:
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Hal, to step Corpus: 002186
- to stream Hal, to step Curation: 002186
- to stream Hal, to step Checkpoint: 003E24
- to stream Main, to step Merge: 005077
- to stream Main, to step Curation: 004F12
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Extraction of Complex Relations in Humanistic : Statistics, Itemsets and Association Rules</title>
<title xml:lang="fr">Extraire et valider les relations complexes en sciences humaines : statistiques, motifs et règles d'association</title>
<author><name sortKey="Cadot, Martine" sort="Cadot, Martine" uniqKey="Cadot M" first="Martine" last="Cadot">Martine Cadot</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-31478" status="OLD"><idno type="IdRef">168612127</idno>
<idno type="ISNI">0000 0001 2193 4396</idno>
<idno type="RNSR">199613836L</idno>
<orgName>Laboratoire de Semio-Linguistique, Didactique et Informatique</orgName>
<orgName type="acronym">LASELDI</orgName>
<desc><address><addrLine>30 rue Mégevand 25030 Besançon cedex </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr/pages/fr/ea-2281---laseldi-7966.html</ref>
</desc>
<listRelation><relation name="EA2281" active="#struct-242365" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="EA2281" active="#struct-242365" type="direct"><org type="institution" xml:id="struct-242365" status="VALID"><idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:tel-00594174</idno>
<idno type="halId">tel-00594174</idno>
<idno type="halUri">https://tel.archives-ouvertes.fr/tel-00594174</idno>
<idno type="url">https://tel.archives-ouvertes.fr/tel-00594174</idno>
<date when="2006-12-12">2006-12-12</date>
<idno type="wicri:Area/Hal/Corpus">002186</idno>
<idno type="wicri:Area/Hal/Curation">002186</idno>
<idno type="wicri:Area/Hal/Checkpoint">003E24</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">003E24</idno>
<idno type="wicri:Area/Main/Merge">005077</idno>
<idno type="wicri:Area/Main/Curation">004F12</idno>
<idno type="wicri:Area/Main/Exploration">004F12</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Extraction of Complex Relations in Humanistic : Statistics, Itemsets and Association Rules</title>
<title xml:lang="fr">Extraire et valider les relations complexes en sciences humaines : statistiques, motifs et règles d'association</title>
<author><name sortKey="Cadot, Martine" sort="Cadot, Martine" uniqKey="Cadot M" first="Martine" last="Cadot">Martine Cadot</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-31478" status="OLD"><idno type="IdRef">168612127</idno>
<idno type="ISNI">0000 0001 2193 4396</idno>
<idno type="RNSR">199613836L</idno>
<orgName>Laboratoire de Semio-Linguistique, Didactique et Informatique</orgName>
<orgName type="acronym">LASELDI</orgName>
<desc><address><addrLine>30 rue Mégevand 25030 Besançon cedex </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr/pages/fr/ea-2281---laseldi-7966.html</ref>
</desc>
<listRelation><relation name="EA2281" active="#struct-242365" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="EA2281" active="#struct-242365" type="direct"><org type="institution" xml:id="struct-242365" status="VALID"><idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="mix" xml:lang="en"><term>Association Rules</term>
<term>Data Cleaning and Preprocessing</term>
<term>Data Mining</term>
<term>Fuzzy Itemsets</term>
<term>Fuzzy rules</term>
<term>Itemsets</term>
<term>Knowledge Discovery</term>
<term>Machine Learning</term>
<term>Randomisation Test</term>
<term>Statistical Interaction</term>
<term>Statistical Significance</term>
<term>Text Mining</term>
</keywords>
<keywords scheme="mix" xml:lang="fr"><term>apprentissage artificiel</term>
<term>codage et recodage des données.</term>
<term>extraction de connaissances</term>
<term>fouille de données</term>
<term>fouille de textes</term>
<term>interaction statistique</term>
<term>motifs</term>
<term>motifs flous</term>
<term>nettoyage et prétraitement des données</term>
<term>règles d'association</term>
<term>règles floues</term>
<term>significativité statistique</term>
<term>test de randomisation</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">This thesis is about data mining in the human sciences. This branch of artificial intelligence is a set of methods for extracting knowledge from electronic data. Among them, the extraction of itemsets and association rules builds a symbolic representation of the structure of the data, as classical statistical methods do, but, unlike them, it can handle complex and very large data sets. However, this computer science model, obtained by counting co-occurrences, is not easy for scientists to use: it works with dichotomous (True/False) data, its direct results are difficult to interpret, and its validity can seem doubtful to researchers who work with statistics. We propose four techniques, which we built and tested on real data, to make the extraction of itemsets and association rules easier for scientists to use: 1) our randomisation test, based on "exchanges in cascade" in the subjects x properties matrix, yields the statistically significant links between properties; 2) our fuzzification of the extraction of itemsets and association rules produces fuzzy association rules close to the fuzzy rules defined by researchers of the fuzzy community around Zadeh; 3) our Midova algorithm extracts only the interactions; and 4) our meta-rules clean the set of association rules of its main contradictions and redundancies.</div>
</front>
</TEI>
<affiliations><list><country><li>France</li>
</country>
<region><li>Franche-Comté</li>
</region>
<settlement><li>Besançon</li>
</settlement>
<orgName><li>Université de Bourgogne Franche-Comté</li>
<li>Université de Franche-Comté</li>
</orgName>
</list>
<tree><country name="France"><region name="Franche-Comté"><name sortKey="Cadot, Martine" sort="Cadot, Martine" uniqKey="Cadot M" first="Martine" last="Cadot">Martine Cadot</name>
</region>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 004F12 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 004F12 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Lorraine |area= InforLorV4 |flux= Main |étape= Exploration |type= RBID |clé= Hal:tel-00594174 |texte= Extraction of Complex Relations in Humanistic : Statistics, Itemsets and Association Rules }}
This area was generated with Dilib version V0.6.33.